Direct JSON writer + class pickle cache (R3-R4) #7

Merged: jensens merged 4 commits into main from feature/direct-json-writer on Feb 25, 2026

jensens (Member) commented Feb 25, 2026

Summary

  • Direct PickleValue → JSON string writer (R3): Eliminates all serde_json::Value intermediate allocations in the PG decode path. Writes JSON tokens directly from the Rust AST to a String buffer with the GIL released. PG path is 1.3-3.3x faster than dict+json.dumps(), wide_dict -55%, FileStorage pipeline 1.4x faster at median.

  • Class pickle cache (R4): Thread-local cache of class pickle bytes per (module, name) pair. With ~6 distinct classes in a typical ZODB database, replaces 7 opcode writes with a single memcpy on ~99.6% of records. FileStorage encode -2 to -4%.

  • Benchmark docs: All numbers updated to R4+PGO. PGO is now the standard build for benchmarking. Added decompression step for Data.fs.gz.
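
To illustrate the R3 idea of writing JSON tokens straight into a `String` instead of first building a `serde_json::Value` tree, here is a minimal sketch. The `PickleValue` enum below is a hypothetical, heavily reduced stand-in for the crate's AST, and `write_json` / `to_json_string` are illustrative names, not the functions added in this PR. Floats (which the PR formats with ryu), the typed fast paths, and the GIL release are left out.

```rust
use std::fmt::Write;

/// Hypothetical, reduced stand-in for the crate's PickleValue AST.
#[derive(Debug, Clone, PartialEq)]
pub enum PickleValue {
    None,
    Bool(bool),
    Int(i64),
    String(String),
    List(Vec<PickleValue>),
    Dict(Vec<(String, PickleValue)>),
}

/// Append `s` as a JSON string literal, escaping where JSON requires it.
fn write_json_str(out: &mut String, s: &str) {
    out.push('"');
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => {
                // Remaining control characters use the \u00XX form.
                let _ = write!(out, "\\u{:04x}", c as u32);
            }
            c => out.push(c),
        }
    }
    out.push('"');
}

/// Recursively append the JSON form of `v` to `out`.
/// No intermediate value tree is allocated.
pub fn write_json(out: &mut String, v: &PickleValue) {
    match v {
        PickleValue::None => out.push_str("null"),
        PickleValue::Bool(b) => out.push_str(if *b { "true" } else { "false" }),
        PickleValue::Int(i) => {
            let _ = write!(out, "{}", i);
        }
        PickleValue::String(s) => write_json_str(out, s),
        PickleValue::List(items) => {
            out.push('[');
            for (n, item) in items.iter().enumerate() {
                if n > 0 {
                    out.push(',');
                }
                write_json(out, item);
            }
            out.push(']');
        }
        PickleValue::Dict(pairs) => {
            out.push('{');
            for (n, (key, val)) in pairs.iter().enumerate() {
                if n > 0 {
                    out.push(',');
                }
                write_json_str(out, key);
                out.push(':');
                write_json(out, val);
            }
            out.push('}');
        }
    }
}

/// Convenience wrapper that allocates a fresh output buffer.
pub fn to_json_string(v: &PickleValue) -> String {
    let mut out = String::new();
    write_json(&mut out, v);
    out
}
```

The only allocation in this sketch is the output `String` itself, which is the property the summary attributes to the direct writer.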

Key numbers (R4+PGO vs CPython pickle)

| Path | Metric | Result |
|------|--------|--------|
| Encode (synthetic) | vs Python | 1.7-9.2x faster |
| Decode (synthetic) | vs Python | 1.0-2.3x faster |
| PG JSON string | vs dict+dumps | 1.3-3.3x faster |
| FileStorage encode | median | 5.0x faster (4.0 µs vs 19.9 µs) |
| FileStorage PG pipeline | median | 1.4x faster (24.4 µs vs 34.7 µs) |
| Plone page (50 objects) | codec overhead | ~1.4 ms total |
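
As a quick sanity check on how the table's factors follow from the raw timings, the arithmetic can be sketched as below. The helper names `speedup` and `per_object_us` are illustrative only and are not part of the benchmark suite.

```rust
/// Speedup factor: how many times faster `new_us` is than `baseline_us`.
pub fn speedup(baseline_us: f64, new_us: f64) -> f64 {
    baseline_us / new_us
}

/// Average per-object cost in microseconds for a total given in milliseconds.
pub fn per_object_us(total_ms: f64, objects: u32) -> f64 {
    total_ms * 1000.0 / objects as f64
}
```

With the numbers above, `speedup(19.9, 4.0)` is about 4.98 (reported as 5.0x) and `speedup(34.7, 24.4)` is about 1.42 (reported as 1.4x). `per_object_us(1.4, 50)` works out to roughly 28 µs per object for the Plone page.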

Test plan

  • 198 Rust tests pass (cargo test)
  • 180 Python integration tests pass (pytest tests/)
  • 61 new comparison tests verify byte-for-byte equivalence between old (serde_json) and new (direct writer) paths
  • 2 new tests verify class pickle cache correctness
  • PGO build + full benchmark suite (synthetic + FileStorage + pg-compare)

🤖 Generated with Claude Code

jensens and others added 4 commits February 25, 2026 01:14
Eliminates serde_json::Value intermediate allocations in the PG decode
path (decode_zodb_record_for_pg_json). The new pipeline writes JSON
tokens directly from the PickleValue AST to a String buffer in Rust
with the GIL released, replacing the two-step allocate-then-serialize
approach.

Key changes:
- json_writer.rs: JsonWriter with fast-path string escaping, ryu floats
- json.rs: pickle_value_to_json_string_pg() recursive direct writer
- known_types.rs: try_write_reduce_typed/try_write_instance_typed
- btrees.rs: btree_state_to_json_writer() for all BTree variants
- Thread-local JSON buffer reuse (same pattern as encode ENCODE_BUF)
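
Two of the items above, fast-path string escaping and thread-local buffer reuse, can be sketched as follows. This is an assumption-laden illustration rather than the code in json_writer.rs: `JSON_BUF`, `with_json_buf`, `needs_escape` and `write_escaped` are hypothetical names, and the 4096-byte initial capacity is an arbitrary choice.

```rust
use std::cell::RefCell;
use std::fmt::Write;

thread_local! {
    // Hypothetical per-thread scratch buffer; cleared and reused on every call.
    static JSON_BUF: RefCell<String> = RefCell::new(String::with_capacity(4096));
}

/// True if the byte must be escaped inside a JSON string literal.
fn needs_escape(b: u8) -> bool {
    b < 0x20 || b == b'"' || b == b'\\'
}

/// Write `s` as a JSON string literal. Fast path: if no byte needs
/// escaping, the whole slice is appended with a single push_str.
pub fn write_escaped(out: &mut String, s: &str) {
    out.push('"');
    if !s.bytes().any(needs_escape) {
        out.push_str(s);
    } else {
        for c in s.chars() {
            match c {
                '"' => out.push_str("\\\""),
                '\\' => out.push_str("\\\\"),
                '\n' => out.push_str("\\n"),
                c if (c as u32) < 0x20 => {
                    let _ = write!(out, "\\u{:04x}", c as u32);
                }
                c => out.push(c),
            }
        }
    }
    out.push('"');
}

/// Run `fill` against the thread-local buffer and return a copy of the result.
/// `clear()` keeps the buffer's capacity, so later calls on the same thread
/// do not need to regrow it. `fill` must not call `with_json_buf` itself,
/// since the RefCell is mutably borrowed for the duration of the call.
pub fn with_json_buf<F: FnOnce(&mut String)>(fill: F) -> String {
    JSON_BUF.with(|cell| {
        let mut buf = cell.borrow_mut();
        buf.clear();
        fill(&mut *buf);
        buf.as_str().to_owned()
    })
}
```

Multi-byte UTF-8 sequences only contain bytes at or above 0x80, so the byte scan in the fast path never misclassifies them.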

PG path speedup: 1.3-3.3x faster than dict+json.dumps(), wide_dict -55%.
FileStorage pipeline: 1.4x faster at median (28.3 vs 40.4 µs/record).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Thread-local Vec cache avoids re-encoding identical class pickles for
every ZODB record. With ~6 distinct classes in a typical database, the
cache hits ~99.6% after warmup, replacing 7 opcode writes with a single
memcpy of ~50 bytes.

Uses linear search (faster than HashMap for ~6 entries, avoids string
allocation on cache hits). Extracts build_class_pickle() pub(crate)
helper reused by both production and test encode paths.
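
A minimal sketch of the caching scheme described above, assuming a thread-local `Vec` that is searched linearly by `(module, name)`. The `build_class_pickle` here is a simplified stand-in that emits only `PROTO 2`, `GLOBAL` and `STOP`; it does not reproduce the crate's actual byte layout or signature, and `CacheEntry`, `CLASS_PICKLE_CACHE` and `write_class_pickle` are hypothetical names.

```rust
use std::cell::RefCell;

/// One cached class pickle, keyed by (module, name).
struct CacheEntry {
    module: String,
    name: String,
    bytes: Vec<u8>,
}

thread_local! {
    // Hypothetical cache; a Vec with linear search, since only a handful
    // of distinct classes is expected.
    static CLASS_PICKLE_CACHE: RefCell<Vec<CacheEntry>> = RefCell::new(Vec::new());
}

/// Simplified stand-in: PROTO 2, GLOBAL module/name, STOP.
fn build_class_pickle(module: &str, name: &str) -> Vec<u8> {
    let mut out = Vec::with_capacity(module.len() + name.len() + 6);
    out.extend_from_slice(&[0x80, 0x02]); // PROTO 2
    out.push(b'c'); // GLOBAL
    out.extend_from_slice(module.as_bytes());
    out.push(b'\n');
    out.extend_from_slice(name.as_bytes());
    out.push(b'\n');
    out.push(b'.'); // STOP
    out
}

/// Append the class pickle for (module, name) to `out`.
/// Returns true on a cache hit. A hit compares borrowed &str keys and
/// copies the cached bytes, so it allocates no new strings.
pub fn write_class_pickle(out: &mut Vec<u8>, module: &str, name: &str) -> bool {
    CLASS_PICKLE_CACHE.with(|cell| {
        let mut cache = cell.borrow_mut();
        if let Some(entry) = cache
            .iter()
            .find(|e| e.module == module && e.name == name)
        {
            out.extend_from_slice(&entry.bytes);
            return true;
        }
        // Miss: build once, emit, and remember for later records.
        let bytes = build_class_pickle(module, name);
        out.extend_from_slice(&bytes);
        cache.push(CacheEntry {
            module: module.to_owned(),
            name: name.to_owned(),
            bytes,
        });
        false
    })
}
```

On a miss the key strings are allocated once per class and per thread; every later record of the same class takes the hit path.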

FileStorage encode: -2 to -4% (mean 4.9→4.8, median 4.1→4.0 µs).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- BENCHMARKS.md: updated all numbers to R4+PGO, PGO as standard build
- PERF_REPORT_ROUND3.md: direct JSON writer results (-55% wide_dict)
- PERF_REPORT_ROUND4.md: class pickle cache results (-2 to -4% FS)
- PERF_REPORT_COMPOUND.md: cumulative R1-R4 comparison

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
jensens merged commit edaf55c into main on Feb 25, 2026
6 checks passed
jensens deleted the feature/direct-json-writer branch on February 25, 2026 00:18